-
Notifications
You must be signed in to change notification settings - Fork 292
Follow ups from the recent picture caching optimizations. #3455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
r? @kvark There is still one more known picture caching issue I'm working on locally, but these changes are reasonable to review / land while I continue work on that. Try run (pic caching disabled) looks good: Try run (pic caching enabled) looks good (one new pass in R6): |
(some of these changes are a bit subtle - let me know if they don't make sense and we can discuss) |
This patch contains a number of fixes and improvements to the picture caching changes that landed recently to improve the general performance of picture caching. Specifically: * Support setting scissor rect for the dirty rect that is being drawn. This is a performance win, and also simplifies the batching logic to not have to adjust the z_id for the tiles that are being drawn. * Support alpha batch containers having more than one batch list. Each batch list supports an optional scissor rect, and a list of blits to run after drawing that batch list. This ensures that items batched after the cacheable content are not drawn into the cache tiles. * The pending tile blits no longer need to be stored in the render task or surface information struct. Instead, they are retrieved directly from the picture's tile cache struct during batching. * Include the local clip rect of the picture when drawing tiles, to ensure that the tiles don't write to the z-buffer where the scroll bars for a content frame is (they typically come earlier in the display list than the content, relying on clipping rather than render order). * Include the tile relative position of clip vertices - this ensures that tiles are correctly invalidated in cases where only the relative position of a clip node changes between frames (there are a couple of reftests that verify this). * Handle the case of a zero-sized clip mask correctly, to avoid trying to allocate a zero sized texture if only one clip task exists in a pass. * Skip pushing / popping clip node collector for tile cache surfaces.
74973fa
to
de04075
Compare
This fixes bug 1516695 and bug 1516786. |
We want to get this landed as soon as possible, so I got @bholley to take a quick sanity check and we'll get @kvark to do a detailed follow up review next week. Merging now since both try runs above are green. @bors-servo r=bholley |
📌 Commit de04075 has been approved by |
Follow ups from the recent picture caching optimizations. This patch contains a number of fixes and improvements to the picture caching changes that landed recently to improve the general performance of picture caching. Specifically: * Support setting scissor rect for the dirty rect that is being drawn. This is a performance win, and also simplifies the batching logic to not have to adjust the z_id for the tiles that are being drawn. * Support alpha batch containers having more than one batch list. Each batch list supports an optional scissor rect, and a list of blits to run after drawing that batch list. This ensures that items batched after the cacheable content are not drawn into the cache tiles. * The pending tile blits no longer need to be stored in the render task or surface information struct. Instead, they are retrieved directly from the picture's tile cache struct during batching. * Include the local clip rect of the picture when drawing tiles, to ensure that the tiles don't write to the z-buffer where the scroll bars for a content frame is (they typically come earlier in the display list than the content, relying on clipping rather than render order). * Include the tile relative position of clip vertices - this ensures that tiles are correctly invalidated in cases where only the relative position of a clip node changes between frames (there are a couple of reftests that verify this). * Handle the case of a zero-sized clip mask correctly, to avoid trying to allocate a zero sized texture if only one clip task exists in a pass. * Skip pushing / popping clip node collector for tile cache surfaces. <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/3455) <!-- Reviewable:end -->
☀️ Test successful - status-appveyor, status-taskcluster |
…4e47dc2ef998 (WR PR #3455). r=kats servo/webrender#3455 Differential Revision: https://phabricator.services.mozilla.com/D15632 --HG-- extra : moz-landing-system : lando
…4e47dc2ef998 (WR PR #3455). r=kats servo/webrender#3455 Differential Revision: https://phabricator.services.mozilla.com/D15632
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got some post-land comments
Reviewed 7 of 7 files at r1.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @gw3583)
webrender/src/batch.rs, line 401 at r1 (raw file):
pub fn is_empty(&self) -> bool { self.opaque_batches.is_empty() &&
if there is a blit requested, wouldn't we still need to do something (e.g. clear the target)?
perhaps, all we need is assert!(self.tile_blits.is_empty())
if there are no batches
webrender/src/batch.rs, line 453 at r1 (raw file):
/// Encapsulates the logic of building batches for items that are blended. pub struct AlphaBatchBuilder { pub batch_lists: Vec<BatchList>,
from the ergonomics point of view, it would be cleaner to have:
pub batch_lists: Vec<BatchList>,
pub current_batch_list: BatchList,
This would:
- avoid excessive
unwrap()
calls andcurrent_batch_list()
helper - encode at the type level the property of always having at least one batch list
webrender/src/batch.rs, line 485 at r1 (raw file):
tile_blits: Vec<TileBlit>, ) { let scissor_rect = match (scissor_rect, self.scissor_rect) {
it seems a bit unclear to me that every other (non-initial) batch list gets intersected with the scissor rect of the first list (which is the same as self.scissor_rect). What makes the first batch list special?
webrender/src/batch.rs, line 489 at r1 (raw file):
Some(rect0.intersection(&rect1).unwrap_or(DeviceIntRect::zero())) } (Some(rect0), None) => Some(rect0),
could probably just be (a, b) => a.or(b),
webrender/src/batch.rs, line 505 at r1 (raw file):
} fn can_merge(&self) -> bool {
it's not clear what this means: can merge into something, or can merge something into this one
webrender/src/batch.rs, line 512 at r1 (raw file):
pub fn build( mut self, batch_containers: &mut Vec<AlphaBatchContainer>,
nit: would be good to use Self
here for clarity
webrender/src/batch.rs, line 1124 at r1 (raw file):
// main picture primitive list, and draw them first. if let Some(ref dirty_region) = tile_cache.dirty_region { let mut tile_blits = Vec::new();
nit: could use with_capacity()
webrender/src/batch.rs, line 1132 at r1 (raw file):
dest_offset: blit.dest_offset, size: blit.size, target: blit.target.clone(),
nit: could just do .. blit.clone()
at the end of initialization instead of the 3 fields
webrender/src/batch.rs, line 1140 at r1 (raw file):
} self.push_new_batch_list(
it's somewhat unfortunate that the code is so statefull here. Ideally we'd have something more functional:
let mut batch_list = BatchList::new(...);
batch_list.add_picture(...);
self.batch_list.push(batch_list);
webrender/src/picture.rs, line 172 at r1 (raw file):
/// ensures that if a clip node is supplied but has a different /// transform between frames that the tile is invalidated. clip_vertices: ComparableVec<VectorKey>,
in the long term, we need to find a way to ensure at some level that everything a tile depends on is a part of the tile descriptor. Currently this is only held by the author reasoning, and the errors are discovered based on bugs that pop up, which isn't ideal.
webrender/src/renderer.rs, line 3305 at r1 (raw file):
//Note: depth equality is needed for split planes self.device.set_depth_func(DepthFunction::LessEqual); self.device.enable_depth();
we don't have a check for target.needs_depth()
any more. Can we at least assert on it?
webrender/src/prim_store/mod.rs, line 1734 at r1 (raw file):
// Build the dirty region(s) for this tile cache. pic.local_clip_rect = tile_cache.post_update(
Could you clarify this part? It appears to be overwriting the local_clip_rect, and the result here is taken from inverse projecting, which means it's very conservative and sometimes can be +- infinity
…4e47dc2ef998 (WR PR #3455). r=kats servo/webrender#3455 Differential Revision: https://phabricator.services.mozilla.com/D15632 UltraBlame original commit: 37c0d21de4a5f56800548d867c8ebd5a9d92dc98
…4e47dc2ef998 (WR PR #3455). r=kats servo/webrender#3455 Differential Revision: https://phabricator.services.mozilla.com/D15632 UltraBlame original commit: 37c0d21de4a5f56800548d867c8ebd5a9d92dc98
…4e47dc2ef998 (WR PR #3455). r=kats servo/webrender#3455 Differential Revision: https://phabricator.services.mozilla.com/D15632 UltraBlame original commit: 37c0d21de4a5f56800548d867c8ebd5a9d92dc98
This patch contains a number of fixes and improvements to
the picture caching changes that landed recently to improve
the general performance of picture caching. Specifically:
being drawn. This is a performance win, and also simplifies
the batching logic to not have to adjust the z_id for the
tiles that are being drawn.
Each batch list supports an optional scissor rect, and a list
of blits to run after drawing that batch list. This ensures
that items batched after the cacheable content are not drawn
into the cache tiles.
task or surface information struct. Instead, they are retrieved
directly from the picture's tile cache struct during batching.
to ensure that the tiles don't write to the z-buffer where the
scroll bars for a content frame is (they typically come earlier
in the display list than the content, relying on clipping rather
than render order).
that tiles are correctly invalidated in cases where only the
relative position of a clip node changes between frames (there are
a couple of reftests that verify this).
trying to allocate a zero sized texture if only one clip task
exists in a pass.
This change is